EW-9372 EW-9455 [o11y] Report SpanOpen event earlier#5370
EW-9372 EW-9455 [o11y] Report SpanOpen event earlier#5370
Conversation
20b779f to
b6452a3
Compare
Merging this PR will not alter performance
Comparing Footnotes
|
|
Tests fail, but that just might be from being out of sync or ontop of something old? A short PR description would help. |
|
Closing this for now – delivering SpanOpen earlier would make it more difficult to implement renaming spans, which we may support in the future. |
|
We should definitely reopen this and send SpanOpen events when the spans open and not when they close. |
7db22e8 to
c73e84d
Compare
6d91a06 to
538c76c
Compare
|
The generated output of |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5370 +/- ##
==========================================
- Coverage 70.35% 70.30% -0.06%
==========================================
Files 408 408
Lines 108651 108735 +84
Branches 17991 18007 +16
==========================================
+ Hits 76444 76448 +4
- Misses 21409 21482 +73
- Partials 10798 10805 +7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This PR serves to perform two long-standing cleanup tasks in the STW implementation: 1) Sending the SpanOpen event as soon as a span is opened instead of when it closes 2) Getting rid of the CompleteSpan struct, which represents a full span but is something that won't be needed once SpanOpen is handled separately. To implement this in a backwards-compatible way, we need to land it in two parts so that the old code path and the new code path are both supported until we have phased out the old version which doesn't have the APIs for handling SpanOpen separately.
538c76c to
c5c817f
Compare
The coverage percentage appears lower than it should be here as some functions are only used upstream/not used at all until the follow-up PR. I'm convinced that our coverage doesn't actually get worse when taking that into account. |
c5c817f to
cc51938
Compare
|
|
||
| // helper method for addSpan() implementations | ||
| void adjustSpanTime(tracing::CompleteSpan& span); | ||
| void adjustSpanTime(tracing::SpanEndData& span); |
There was a problem hiding this comment.
These will be deduplicated in the follow-up PR
| // | ||
| // This should always be called exactly once per observer. | ||
| // This should always be called exactly once per observer at span completion time. | ||
| virtual void report(const Span& span) = 0; |
There was a problem hiding this comment.
For user tracing, report() will be used to only transmit the span end data in the follow-up. For internal tracing, this will continue to be used to transmit the full span. reportStart() is not yet used and will only used in the user tracing framework (no-op otherwise).
| kj::HashMap<kj::ConstString, tracing::Attribute::Value> tags; | ||
|
|
||
| // Convert CompleteSpan to SpanEndData | ||
| explicit SpanEndData(CompleteSpan&& span); |
There was a problem hiding this comment.
This will also no longer be needed with the follow-up.
| } | ||
| } | ||
|
|
||
| void BaseTracer::adjustSpanTime(tracing::SpanEndData& span) { |
There was a problem hiding this comment.
This is copied from BaseTracer::adjustSpanTime(tracing::CompleteSpan&), will be deduplicated in the follow-up PR. Note that the operationName variable is no longer available, but if we need to debug an error here we can still infer the operation using the spanId, which maps to the span's SpanOpen event which has the operationName.
| # Representation of an event that indicates completion of a user span. This information is | ||
| # provided to the streaming tail worker in the Attributes and SpanClose events. | ||
|
|
||
| # TODO(cleanup): startTimeNs is merely used as a fallback timestamp, consider obsoleting it. |
There was a problem hiding this comment.
This is only used as a fallback timestamp in the case of errors, but @mar-cf urged me to still support it.
| } | ||
| } | ||
|
|
||
| void SpanEndData::copyTo(rpc::SpanEndData::Builder builder) const { |
There was a problem hiding this comment.
You'll notice that this is also duplicated with CompleteSpan, but that will go away in the follow-up as discussed.
Purpose:
This PR serves to perform two long-standing cleanup tasks in the STW implementation:
To implement this in a backwards-compatible way, we need to land it in two parts so that the old code path and the new code path are both supported until we have phased out the old version which doesn't have the APIs for handling SpanOpen separately.
For code that is workerd-only and thus never involved in RPC or that is solely on the RPC server side, we can already decompose function calls so that we don't need to implement the sape functionality twice. This needs to land alongside a downstream PR. A follow-up PR will actually invoke the code path to send SpanOpen first, get rid of CompleteSpan struct and perform a bunch of cleanup – see #6051.
Note that: